Abstract A widely used land surface model, the Variable Infiltration Capacity (VIC) model, is coupled
with a newly developed hierarchical dominant river tracing-based runoff-routing model to form the Domi- 
nant river tracing-Routing Integrated with VIC Environment (DRIVE) model, which serves as the new core of 
the real-time Global Flood Monitoring System (GFMS). The GFMS uses real-time satellite-based precipitation 
to derive flood monitoring parameters for the latitude band 50°N-50°S at relatively high spatial (~12 km)
and temporal (3 hourly) resolution. Examples of model results for recent flood events are computed using 
the real-time GFMS (http://flood.umd.edu). To evaluate the accuracy of the new GFMS, the DRIVE model is 
run retrospectively for 15 years using both research-quality and real-time satellite precipitation products.
Evaluation results are slightly better for the research-quality input and significantly better for longer dura- 
tion events (3 day events versus 1 day events). Basins with fewer dams tend to provide lower false alarm 
ratios. For events longer than three days in areas with few dams, the probability of detection is ~0.9 and 
the false alarm ratio is ~0.6. In general, these statistical results are better than those of the previous system. 
Streamflow was evaluated at 1121 river gauges across the quasi-global domain. Validation using real-time 
precipitation across the tropics (30°S-30°N) gives positive daily Nash-Sutcliffe Coefficients for 107 out of 
375 (28%) stations with a mean of 0.19 and 51% of the same gauges at monthly scale with a mean of 0.33. 
There were poorer results in higher latitudes, probably due to larger errors in the satellite precipitation 
input. 


1. Introduction 

Floods are a leading natural disaster with significant, negative socioeconomic impacts worldwide. According
to the World Disaster Report [2012], floods and associated landslides accounted for more than 55% (about 2000)
of a total of 3600 significant natural disasters during 2002-2011 over the globe; they killed over 65,000 people,
affected over 1.1 billion people and cost an estimated $280 billion (US Dollars in 2011). Most of these disasters
occurred in densely populated and under-developed areas where an effective flood monitoring and
forecasting system is lacking due to insufficient resources [Wu et al., 2012a]. A reliable flood monitoring and
forecasting system at a global scale is extremely desirable to a variety of national and international agencies 
for humanitarian response, hazard mitigation, and management. Satellite remote sensing has opened a 
new era to pursue global flood estimation (particularly important for remote and trans-boundary areas) by 
providing: (1) flood extent mapping via direct observations using optical [e.g., Brakenridge, 2006; Ordoyne 
and Friedl, 2008] or Synthetic Aperture Radar imagery [e.g., Horritt et al., 2003; Mason et al., 2012]; and (2)
flood monitoring and forecasting through the use of hydrologic models and observational inputs for precipitation,
land cover, vegetation, topography, hydrography, etc. [e.g., Shrestha et al., 2008; Wu et al., 2012a;
Alfieri et al., 2013], which is the subject of this paper.

Rainfall estimation is the most critical meteorological input of a hydrologic model for real-time flood estima- 
tion, and can be obtained through satellite remote sensing with reliable availability at relatively high spatial- 
temporal resolution and short lag time (hours). One such satellite-based precipitation product, the National 
Aeronautics and Space Administration (NASA), Tropical Rainfall Measuring Mission (TRMM), Multi-satellite Pre- 
cipitation Analysis (TMPA) [Huffman et al., 2007], has been successfully applied in many hydrologic modeling 
applications [e.g., Harris et al., 2007; Su et al., 2008, 2011]. The TMPA precipitation products are composed of
multiple satellite estimates calibrated, or adjusted, to the information from the TRMM satellite itself, which
carries both a radar and a passive microwave sensor. An experimental Global Flood Monitoring System (GFMS)
using the real-time version of the TMPA precipitation information (3 h, with ~6 h lag, 0.25° latitude-longitude
resolution) for quasi-global (50°S-50°N) coverage was developed and improved [Hong et al., 2007; Yilmaz
et al., 2010; Wang et al., 2011; Wu et al., 2012a] and has been running routinely for the last few years, providing
useful results for a number of organizations. Currently, this real-time flood estimation system is often the only
source of quantitative information during significant flood events, when information is needed for relief
efforts by humanitarian agencies, such as the United Nations Office for the Coordination of Humanitarian Affairs
(OCHA) and the United Nations World Food Programme (WFP).

WU ET AL. ©2014. American Geophysical Union. All Rights Reserved. AGU Water Resources Research, 10.1002/2013WR014710

Evaluations of various hydrologic model-based flood estimation calculations using satellite precipitation
data have been conducted with positive performances at local and regional scales [e.g., Shrestha et al.,
2008; Pan et al., 2010; Su et al., 2008, 2011]. On a larger, global scale, Wu et al. [2012a] evaluated the previous
version of the GFMS, which was based on a grid-based hydrologic model [Wang et al., 2011], driven by the
TMPA 3B42V6 research (nonreal-time) rainfall product. They examined the performance in flood event
detection against available flood inventories, showing that the GFMS flood detection performance improves
with longer flood durations and larger affected areas. The presence of dams tended to result in more false
alarms and longer false alarm duration. For flood durations greater than 3 days and for areas without dams,
this previous system had a probability of detection (POD) of ~0.70 and a false alarm ratio (FAR) of ~0.65
[Wu et al., 2012a].

These evaluations of our previous systems [Yilmaz et al., 2010; Wu et al., 2012a] indicated pathways toward an
improved approach with greater flexibility and accuracy. The key areas for potential improvement included 
consideration of subgrid hydrologic processes, inclusion of cold season processes and improved routing that 
could lead to two-way interaction between the land surface processes and the routing calculations. A land 
surface model (LSM) can be used to effectively calculate land surface and subsurface runoff through its verti- 
cal water-energy processes, partitioning precipitation into infiltration, evapotranspiration, and runoff compo- 
nents. However, a lateral process for runoff-routing is usually lacking within most LSMs, though an efficient 
and accurate runoff-routing scheme can have significant impacts on delineation of river basin water and 
energy budgets [Decharme et al., 2011], and be critically important for flood simulation. For LSMs, such as the
Variable Infiltration Capacity (VIC) model [Liang et al., 1994, 1996], the traditional cell-to-cell or source-sink
routing models based on widely used Unit Hydrograph methods, e.g., Lohmann et al. [1996] and Wu et al.
[2012c], can be used to successfully simulate streamflow by postprocessing the LSM runoff output. However, it
is difficult (if even possible) to couple this type of routing model with an LSM (with feedbacks to the LSM 
online) for global-scale real-time flood calculation. This is because the convolution algorithms have to incorpo- 
rate all upstream runoff information for multiple previous time steps to determine the streamflow for a spe- 
cific downstream grid cell at a time step. For this study, we developed a new hydrologic module for the GFMS 
by coupling the widely used VIC land surface model with a recently developed physically based hierarchical 
dominant river tracing [Wu et al., 2011, 2012b] based runoff-Routing (DRTR) model. This new coupled system,
the Dominant river tracing-Routing Integrated with VIC Environment (DRIVE) model, is intended to provide
improved global results and increased flexibility for implementation of future improvements. 

In this paper, we describe this new DRIVE-based version of the GFMS and evaluate the performance of the 
system on a global basis against streamflow observations and flood event archives, using satellite precipita- 
tion information from both the real-time and research products. Section 2 of this paper describes the meth- 
odology, particularly on the DRIVE coupled model system; section 3 outlines the model data inputs and 
parameterization; section 4 focuses on the model evaluation; and conclusions and future work are pre- 
sented in section 5. 


2. Methodology 

The new real-time GFMS (http://flood.umd.edu) combines the satellite-based precipitation estimation, run- 
off generation, runoff routing, and flood identification using the DRIVE coupled model system described in 
detail in sections 2.1 and 2.2. 

2.1. Variable Infiltration Capacity (VIC) Model 

Hydrologically oriented LSMs, such as the VIC model, solve for full water and energy balances with good 
skill for water budget estimation [Peters-Lidard et al., 2011]. We selected the VIC model as a critical part of
our GFMS for two additional reasons. First, significant community development has been carried out, and
continued improvement will be maximized by being part of this larger community of land surface model
development and testing. The VIC model has been successfully applied for many hydrologic simulations
and water resource management studies, including flooding [e.g., Hamlet and Lettenmaier, 2007; Hamlet
et al., 2010; Eisner et al., 2010; Voisin et al., 2011]. Through these studies, the VIC model has been generally
well parameterized across the globe and thus provides a good starting point for global applications such as
this study. Second, the VIC model includes a module for snow and soil frost dynamics [Storck et al., 2002;
Cherkauer and Lettenmaier, 2003], with good validation against streamflow observations in many
snowmelt-dominated basins, particularly in mountainous areas [Christensen et al., 2004; Christensen and Lettenmaier,
2007; Hamlet et al., 2005; Eisner et al., 2010; Wu et al., 2012c]. This will benefit the GFMS in forecasting spring 
streamflow and snowmelt-related floods and allow us to estimate floods in a large part of the globe with 
snowmelt-dominant basins. 

Representation of complex physical processes at a spatial resolution commensurate with LSMs through sub- 
grid process is a good strategy to balance data availability, heavy computing loads, and model accuracy. 
Inclusion of subgrid processes is a major feature of the VIC model contributing to its good performance in 
runoff generation calculations. The VIC model considers the subgrid heterogeneity of infiltration capacity 
through statistical variable infiltration curves [Zhao and Liu, 1995], which have been demonstrated to work 
very well for large-scale applications [Sivapalan and Woods, 1995]. The VIC model also considers subgrid 
parameterization and processes on fractional subgrid areas for different land cover types and elevation 
bands. To use the VIC model for real-time runoff prediction, we made a significant effort to modify the VIC 
model from its original individual grid cell-based mode to a mode that is able to simulate spatially distrib- 
uted runoff at each time step, i.e., computing all the grid boxes at each time step. The modification was per- 
formed on the version of the VIC model (v4.1.1) in an efficient way without changing model physics, so that 
we can conveniently update our modified VIC model periodically using the updates from the VIC model 
community. 
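
The change in loop order described above can be illustrated with a minimal sketch. It is not VIC code: `bucket_step` is a hypothetical single-store runoff rule standing in for the VIC physics, and all function names are assumptions made for illustration. Both drivers produce identical runoff for independent cells, but only the time-major driver has the full runoff field available at the end of every step, which is what a coupled routing model needs.

```python
def bucket_step(storage, precip, capacity=50.0):
    """Hypothetical stand-in for the land surface physics (not VIC):
    fill a store and spill anything above its capacity as runoff (mm)."""
    storage += precip
    runoff = max(0.0, storage - capacity)
    return storage - runoff, runoff


def run_cell_major(initial_states, forcing, step):
    """Cell-by-cell mode: finish every time step of one cell before the next cell."""
    out = [[0.0] * len(initial_states) for _ in forcing]
    for c, state in enumerate(initial_states):
        for t, precip_field in enumerate(forcing):
            state, out[t][c] = step(state, precip_field[c])
    return out


def run_time_major(initial_states, forcing, step):
    """Time-step mode: update all cells at each step, so the complete
    runoff field exists before moving on to the next time step."""
    states = list(initial_states)
    out = []
    for precip_field in forcing:
        field = []
        for c in range(len(states)):
            states[c], runoff = step(states[c], precip_field[c])
            field.append(runoff)
        out.append(field)  # a routing model could consume `field` here
    return out


if __name__ == "__main__":
    states = [0.0, 40.0]                                  # initial storage per cell (mm)
    forcing = [[30.0, 20.0], [30.0, 0.0], [10.0, 5.0]]    # precipitation per step and cell (mm)
    print(run_time_major(states, forcing, bucket_step))
```

Because the physics of each cell is untouched, the reordering changes only when results become available, not what they are.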

2.2. Dominant River Tracing-Based Runoff-Routing (DRTR) Model and Coupling with VIC Model 

For clarity, the term "runoff" hereafter stands for the excess water generated in each grid cell for routing 
with units of depth (mm), while "streamflow" and "discharge" are used interchangeably to indicate the 
routed flows in the channel/floodplain network with units of m³/s. The function of a routing model is to
transport water (runoff) downstream in a river basin system until the river empties into the ocean or a lake. 
A routing model consists of two main components: (1) the description of the river basin drainage system, 
i.e., simplifying the basin drainage system into a parameterized concept and (2) the physical and numerical 
models for computer simulation of streamflow and other variables with appropriate assumptions commen- 
surate with the simplifications in the drainage basin concept. Recently developed and relatively advanced 
physically based routing schemes for large-scale applications [e.g., Decharme et al., 2011; Yamazaki et al.,
2011; Li et al., 2013] usually deploy similar governing equations taken from various forms of the classic St- 
Venant equations based on mass and momentum conservation, often using the kinematic wave and diffu- 
sion wave methods. The essential differences among routing models of this type lie in the levels at which a 
drainage system is abstracted and simplified, and the techniques used for parameterizing each element 
within the model conception. 

In this study, we implemented a physically based routing model based on the hierarchical DRT method [Wu 
et al., 2011, 2012b], which includes a package of hydrographic upscaling (from fine spatial resolution to
coarse resolution) algorithms and resulting global data sets (flow direction, river network, drainage area, 
flow distance, slope, etc.) especially designed for large-scale hydrologic modeling. This DRT-based runoff- 
Routing (DRTR) model is grid based and convenient for coupling with the modified gridded VIC model to 
simulate spatially distributed streamflow. 

2.2.1. The DRTR Model Concept and Parameterization 

Recently developed grid-based, large-scale (coarser resolution) routing models usually conceptualize a 
drainage system as connected stem rivers at grid resolution, but with major differences in subgrid process 
(routing) delineation. Given the generally well-established mathematics and physics for land surface routing 
simulation, the major challenge to implementing a large-scale routing scheme lies in obtaining accurate 
parameterization of the model elements (particularly at subgrid scale). For example, a recent large-scale
routing model on a grid basis [Li et al., 2013], deploying a kinematic wave type routing method, conceptualized
the routing process by using a hypothetical subgrid channel to link hillslopes and stem rivers, with
a transport capacity equivalent to all tributaries combined, while linking the grids via the stem river network
derived by the DRT upscaling algorithm of Wu et al. [2011, 2012b]. Due to the scale-consistent stem
river network derived by the DRT algorithm and the scale-consistent subgrid routing parameterization, this
large-scale routing model showed a consistent model performance across different spatial resolutions [Li
et al., 2013].

Figure 1. The DRTR routing model concept on river basin drainage system at (a) grid and (b-d) subgrid scales using a real river basin
(Mbemkuru river basin, Southeast of Tanzania) as example. The light blue lines in Figure 1a are the baseline high-resolution (1 km) river
network from HydroSHEDS and the red lines are the DRT-derived coarse-resolution rivers (1/8th degree in this case). Legend: high-resolution
baseline river; DRT upscaled river; ticks for dominant river intervals; arrows for overland flow into tributaries; arrows for overland
flow directly into the dominant river (dark blue line).

In this study, we implemented the DRTR routing model using a drainage system concept similar to Li et al.
[2013], but with differences in subgrid parameterization using the full strength of the DRT algorithms to
allow more-detailed high resolution subgrid information that is aggregated for coarser resolution routing
simulation and for numeric solutions of the governing equations. Under the gridded DRT framework, the
hydrologic system of each river basin is conceptualized as a hierarchically connected hillslope-river-lake or
ocean system. All grid cells are connected via the predominant river (or flow path) running through the grid
cell, which forms the major drainage network for the river basin (Figure 1a, red lines). For coarser spatial
resolution (e.g., coarser than 1 km) hydrologic modeling, the DRT derives the predominant river (red lines)
from the fine-resolution river network (blue lines) [Wu et al., 2011]. Figure 1b shows a typical real drainage
system within an individual grid cell, represented by high-resolution river network data, with one predominant
river (dark blue) collecting runoff from tributaries (light blue) and overland areas (blank), which is
conceptualized as in Figure 1c with simplified subgrid tributaries (light blue lines). At the subgrid scale, the
predominant river within each grid cell is divided into one or multiple river intervals (Figures 1c and 1d, purple
ticks). Each dominant river interval can have one "effective tributary" (Figures 1c and 1d, light blue lines)
collecting runoff from its overland contributing area even if there are multiple tributaries (defined from
the high resolution river network) connected to the dominant river interval. All secondary dominant rivers [Wu
et al., 2011] within a coarse grid cell, if any, are treated as tributaries. The overland area of each grid cell is
divided into two parts: (1) areas near the dominant river that contribute runoff directly to it
through overland flow (Figure 1d, dark blue arrows); and (2) areas contributing to the dominant river
through tributaries (Figure 1d, light blue arrows). Within each grid cell, runoff generated on hillslopes is
routed to its corresponding tributary through overland flow and then is treated as channel flow to enter the
relevant dominant river interval. The overland flow and the tributary flow are treated as lateral inflow evenly
distributed along the tributary and the predominant river interval, respectively. Once water enters the
dominant river intervals, the river routing calculations follow the hierarchical dominant river ordering
sequence in the major river network. Floodplain, reservoir, and lake elements are not included in the current
model.

All the elements (hillslope, tributary, and predominant river) (Figure 1) are identified and parameterized by 
the DRT on a pixel-to-pixel basis tracing from the finer resolution river network (or flow path). In this study 
(model running at 1/8th degree resolution), we set the number of "effective tributaries" of each grid cell to
one, while parameterizing the effective tributary (including tributary length, slope, width, etc.) using the
value averaged from all tributaries within that grid cell as shown in Figure 1b. The channel width is estimated
by an empirical relation to corresponding drainage area. The overland area within a grid cell directly
contributing runoff to the corresponding dominant river is identified first using the DRT from high resolu- 
tion flow direction map and the remaining area of the grid cell is assigned to the effective tributary. The 
DRT also uses the Strahler ordering system [Strahler, 1957] to define a hierarchical drainage network topol- 
ogy, e.g., for the upstream-downstream relationships and conjunctions connecting different river reaches. 
The model structure, based on the Strahler ordering system, is efficient for integrating numerical calcula- 
tions established on each individual element for a better approximation of the characteristics of natural 
hierarchical runoff propagation. 
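
As an illustration of how Strahler ordering yields an upstream-to-downstream calculation sequence, the following minimal sketch assigns Strahler orders on a small network. It is not the DRT implementation: the `downstream` dictionary, the reach identifiers, and the function name are hypothetical, and every reach is assumed to appear as a key with its downstream neighbour (or `None` at the outlet) as the value.

```python
from collections import defaultdict


def strahler_orders(downstream):
    """Return (order, sequence) for a river network.

    downstream: dict mapping each reach id to the id of the reach it drains
    into, or None for the outlet (hypothetical data layout).
    order:    Strahler order of each reach.
    sequence: processing order in which every reach follows all of its
              upstream reaches, as needed for routing.
    """
    upstream = defaultdict(list)
    for reach, down in downstream.items():
        if down is not None:
            upstream[down].append(reach)

    # Number of upstream reaches not yet processed for each reach.
    pending = {reach: len(upstream[reach]) for reach in downstream}
    ready = [reach for reach, count in pending.items() if count == 0]

    order, sequence = {}, []
    while ready:
        reach = ready.pop()
        inflow_orders = [order[u] for u in upstream[reach]]
        if not inflow_orders:
            order[reach] = 1  # headwater reach
        else:
            top = max(inflow_orders)
            # The order increases only where two or more reaches of the
            # highest incoming order meet.
            order[reach] = top + 1 if inflow_orders.count(top) >= 2 else top
        sequence.append(reach)

        down = downstream[reach]
        if down is not None:
            pending[down] -= 1
            if pending[down] == 0:
                ready.append(down)
    return order, sequence


if __name__ == "__main__":
    network = {"a": "c", "b": "c", "d": "e", "g": "e", "c": "f", "e": "f", "f": None}
    orders, seq = strahler_orders(network)
    print(orders)
    print(seq)
```

Processing reaches in `sequence` guarantees that the inflow from every upstream reach is known before a reach itself is routed.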


2.2.2. DRTR Routing Scheme Governing Equations and Numeric Solutions 

With the comprehensive parameterization provided by the DRT, the routing scheme can conveniently 
deploy different governing equations and numeric solutions to individual routing elements. In this 
study, we present a relatively simple method, i.e., applying the kinematic wave equations to both 
dominant rivers at grid level and tributaries at subgrid level, while assuming the overland surface run- 
off and base flow enter the corresponding dominant river intervals and tributaries within each time 
step. 

A rectangular cross section is assumed for all channels. Equations (1)-(3) are the governing equations adopted
for the kinematic wave method [Chow et al., 1988]:

Continuity equation:
$$\frac{\partial A}{\partial t} + \frac{\partial Q}{\partial x} = q_L \qquad (1)$$

Momentum equation:
$$S_f = S_0 \qquad (2)$$

Manning equation:
$$Q = \frac{S_0^{1/2}}{n P^{2/3}} A^{5/3} \qquad (3)$$

where $t$ is the time (s), $x$ is the longitudinal flow distance (m), $A$ is the wetted area (m²) defined as the
channel cross-section area below the water surface, and $P$ is the wetted perimeter (m). $S_f$ is the friction slope,
which incorporates the impacts of the gravity force, friction force, inertia force, and other forces on the
water. If the topography is steep enough, the gravity force dominates over the others, and $S_f$ can be
approximated by the channel bottom slope $S_0$, which is the basic assumption for kinematic wave routing
approaches [Chow et al., 1988]. In equation (3), $n$ is Manning's roughness coefficient, which is not directly
measurable, but mainly controlled by surface roughness, type of bottom material, and sinuosity of the flow
path. In this study, we applied a constant value of 0.03 globally for both predominant rivers and subgrid
tributaries, although eventually it should be calibrated for local river basins. $Q$ is the streamflow or
discharge (m³/s) and $q_L$ is the lateral discharge per unit channel length (m³/s/m). The backward difference
scheme for equation (1) is


$$\frac{A_{i+1}^{n+1} - A_{i+1}^{n}}{\Delta t} + \frac{Q_{i+1}^{n+1} - Q_{i}^{n+1}}{\Delta x} = q_L \qquad (4)$$

where $i$ and $n$ are the spatial and temporal indexes, respectively. Rewriting the Manning equation, equation
(3), as $A_{i+1}^{n} = \alpha (Q_{i+1}^{n})^{\beta}$ and $A_{i+1}^{n+1} = \alpha (Q_{i+1}^{n+1})^{\beta}$, and substituting in equation (4), we get

$$\frac{\Delta t}{\Delta x} Q_{i+1}^{n+1} + \alpha (Q_{i+1}^{n+1})^{\beta} = \frac{\Delta t}{\Delta x} Q_{i}^{n+1} + \alpha (Q_{i+1}^{n})^{\beta} + q_L \Delta t \qquad (5)$$

where $\alpha = (n P^{2/3} / \sqrt{S_0})^{0.6}$ and $\beta = 0.6$. The right side of equation (5) is known, and the Newton iterative
method is used to solve for the unknown $Q_{i+1}^{n+1}$. The same numeric solution is also used for estimating channel
water depth (mm) and thus for routed runoff (or land surface water storage, (mm)) calculations.
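
A minimal sketch of this solution step follows. It assumes the discretized form obtained by substituting A = αQ^β into equation (4), that is, (Δt/Δx)Q + αQ^β = (Δt/Δx)Q_up + αQ_old^β + q_L Δt, and solves it for the new downstream discharge with Newton iteration. The function name, argument names, tolerance, and the treatment of the wetted perimeter as a fixed input (rather than a function of depth) are assumptions for illustration and are not taken from the DRTR code.

```python
import math


def kinematic_wave_step(q_up_new, q_old, q_lat, dt, dx, manning_n,
                        wetted_perimeter, slope, beta=0.6,
                        tol=1e-8, max_iter=50):
    """Solve the implicit kinematic wave update for the new discharge (m3/s).

    q_up_new: discharge at the upstream node at the new time level (m3/s)
    q_old:    discharge at this node at the previous time level (m3/s)
    q_lat:    lateral inflow per unit channel length (m3/s/m)
    dt, dx:   time step (s) and reach length (m)
    """
    alpha = (manning_n * wetted_perimeter ** (2.0 / 3.0) / math.sqrt(slope)) ** beta
    ratio = dt / dx
    rhs = ratio * q_up_new + alpha * q_old ** beta + q_lat * dt  # known terms

    q = max(q_old, q_up_new, 1e-6)  # positive initial guess
    for _ in range(max_iter):
        residual = ratio * q + alpha * q ** beta - rhs
        slope_res = ratio + alpha * beta * q ** (beta - 1.0)
        q_next = q - residual / slope_res
        if q_next <= 0.0:
            q_next = 0.5 * q  # keep the iterate positive so q**beta stays real
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


if __name__ == "__main__":
    # Steady inflow and no lateral input should leave the discharge unchanged.
    print(kinematic_wave_step(100.0, 100.0, 0.0, 3600.0, 10000.0, 0.03, 50.0, 0.001))
```

With steady inflow and no lateral input the update returns the previous discharge, which is a quick consistency check on the discretization.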

2.2.3. The Coupling of the DRTR Routing Model With the VIC Model 

The vertical model processes of the VIC model run are calculated separately for each subgrid area before 
they are aggregated to a grid-scale output at the end of each model time step. The routing scheme was 
implemented within the VIC model framework taking the VIC estimated runoff as input for the routing cal- 
culation of discharge and routed runoff at each time step. The VIC model was modified to match the DRTR 
routing model structure with all grid cell calculations completed at each time step in the Strahler order- 
based sequence. The routing time step can be finer than the VIC model time step assuming that the runoff 
generation by the VIC model has an even temporal distribution within each VIC model time step. 
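
The even temporal distribution assumed above can be sketched as follows. The function names and the requirement that the routing step divide the VIC step exactly are assumptions made for this illustration, not details taken from the DRIVE code.

```python
def split_runoff(runoff_mm, vic_step_s, routing_step_s):
    """Spread the runoff depth of one VIC step evenly over the routing sub-steps."""
    if routing_step_s <= 0 or vic_step_s % routing_step_s != 0:
        raise ValueError("routing step must divide the VIC step exactly")
    n_sub = vic_step_s // routing_step_s
    return [runoff_mm / n_sub] * n_sub


def runoff_to_inflow_rate(runoff_mm, cell_area_m2, vic_step_s):
    """Convert a runoff depth (mm per VIC step) to a volumetric rate (m3/s).

    Under the even-distribution assumption this rate is constant over all
    routing sub-steps within the VIC step.
    """
    return runoff_mm / 1000.0 * cell_area_m2 / vic_step_s


if __name__ == "__main__":
    # A 3 hourly VIC step routed with hourly sub-steps.
    print(split_runoff(3.0, 3 * 3600, 3600))
    print(runoff_to_inflow_rate(3.6, 1.0e6, 3600))
```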

The DRTR routing scheme, implemented within the modified VIC model, can have bidirectional interactions 
with the VIC model. However, subgrid floodplain delineation for appropriate redistribution of routed runoff 
is needed to really take advantage of the two-way coupling strategy. Therefore, in this study the routing 
scheme was used as a postprocessor for the runoff-routing after each time step from the VIC model. That is, 
there is no two-way interaction between VIC and the DRTR in the following calculations. We plan to test 
and implement this potential improvement in a future study. 

3. Model Setup and Data 

We performed the long-term TRMM era retrospective simulations by running the DRIVE combined model 
using the TMPA 3B42V7 research (which contains monthly rain gauge data, from 1998 to present) and 
TMPA 3B42V7RT real-time precipitation data (which uses only a climatological gauge correction, from 2000 
to present), at 3 hourly temporal and 1/8th degree spatial resolutions for the latitude band 50°N-50°S.
Other forcing data (i.e., air temperature and wind speed) were taken from the NASA Modern-Era Retrospective
analysis for Research and Applications (MERRA) reanalysis [Rienecker et al., 2011]. The phase (liquid versus
solid) of the precipitation is determined based on a simple partitioning scheme using air temperature
within the VIC model [Hamlet et al., 2005]. For each grid cell at a time step, the satellite-based precipitation
is assumed to be 100% snow when the air temperature is below −0.5°C, while it is 100% rain when the
temperature is above 0.5°C. A linear relationship is assumed between the two extremes. The quarter-degree
resolution global soil and vegetation parameters (provided by Justin Sheffield, Princeton University)
were simply projected (pixel replication) to 1/8th degree resolution. This data set included the recent
updated parameters for the VIC model improved through calibration efforts [Troy et al., 2008]. The hydro- 
graphic parameters (e.g., flow direction, drainage area, flow length, channel width, channel slope, overland 
slope, flow fraction, river order) for the DRTR runoff-routing scheme were derived by applying the DRT to 
the HydroSHEDS [Lehner et al., 2008] global 1 km baseline hydrographic data [Wu et al., 2011, 2012b]. Based
on the DRT algorithms, all parameters for subgrid tributaries and flow paths are derived by tracing each 
fine-resolution (i.e., 1 km) grid cell. For example, overland slope and channel (tributary and predominant 
river) slopes for a grid cell are estimated as the average slope of all overland flow paths and channel flow 
paths, respectively, within the grid cell (more details in Li et al. [2013]). Hereafter, TMPA 3B42V7 research 
and real-time precipitation products are referred to as TMPA RP and TMPA RT, respectively, while the DRIVE 
model driven by TMPA RP and TMPA RT is referred to as DRIVE-RP and DRIVE-RT respectively. A 3 year 
model spin-up run was performed (1998-2000) using the DRIVE-RP data to define the initial conditions for 
both scenarios (DRIVE-RP and DRIVE-RT). All model results presented in this study are based on model
parameters either estimated directly from input data (e.g., through DRT algorithms) or from the VIC commu- 
nity (e.g., soil and vegetation parameters). 
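
Two of the preprocessing rules described above are simple enough to sketch directly: the temperature-based rain/snow partitioning (100% snow below −0.5°C, 100% rain above 0.5°C, linear in between) and the pixel replication from quarter-degree to 1/8th degree grids. The function names and the list-of-lists grid layout are assumptions for illustration; the sketch does not reproduce the VIC source code.

```python
def snow_fraction(air_temp_c, t_snow=-0.5, t_rain=0.5):
    """Fraction of precipitation falling as snow, linear between the two thresholds."""
    if air_temp_c <= t_snow:
        return 1.0
    if air_temp_c >= t_rain:
        return 0.0
    return (t_rain - air_temp_c) / (t_rain - t_snow)


def partition_precip(precip_mm, air_temp_c):
    """Return (rain_mm, snow_mm) for one grid cell and time step."""
    frac = snow_fraction(air_temp_c)
    return precip_mm * (1.0 - frac), precip_mm * frac


def replicate_pixels(grid, factor=2):
    """Refine a grid by copying every cell into factor x factor finer cells.

    With factor=2 a quarter-degree grid becomes a 1/8th degree grid without
    any interpolation (pixel replication).
    """
    fine = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


if __name__ == "__main__":
    print(partition_precip(10.0, 0.0))
    print(replicate_pixels([[1, 2], [3, 4]]))
```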

4. Model Results and Model Performance Evaluation 

In order to evaluate the new GFMS performance in flood event detection and streamflow magnitude estimation,
particularly for evaluating the status of the GFMS in real-time flood estimation at the global scale,
we performed the following: (1) evaluating examples of recent flood events as seen by the real-time GFMS,
which has been running the DRIVE model routinely at 3 hourly temporal and 1/8° spatial resolutions over
the globe using the real-time precipitation data; (2) evaluating the system model performance using 2086
archived flood events by Dartmouth Flood Observatory (DFO, http://floodobservatory.colorado.edu),
according to the evaluation method used by Wu et al. [2012a]; and (3) validating against observed daily
streamflow data from the 1121 gauges selected from the Global Runoff Data Centre (GRDC,
http://grdc.bafg.de/) database.
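
The summary statistics quoted in this paper (probability of detection, false alarm ratio, and the Nash-Sutcliffe coefficient) follow standard definitions, sketched below. The function names are illustrative, and the way simulated and archived flood events are matched into hits, misses, and false alarms follows Wu et al. [2012a] and is not reproduced here; the counts are simply taken as inputs.

```python
def probability_of_detection(hits, misses):
    """POD = hits / (hits + misses): share of observed events that were detected."""
    return hits / (hits + misses)


def false_alarm_ratio(hits, false_alarms):
    """FAR = false alarms / (hits + false alarms): share of detections with no observed event."""
    return false_alarms / (hits + false_alarms)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe coefficient: 1 minus the ratio of squared model error
    to the variance of the observations about their mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0.0:
        raise ValueError("observed series has zero variance")
    return 1.0 - error / spread


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    print(nash_sutcliffe(obs, [1.1, 1.9, 3.2, 3.8]))
    print(probability_of_detection(9, 1), false_alarm_ratio(4, 6))
```

A coefficient of 1 indicates a perfect match, while 0 means the simulation is no better than the observed mean, which is why positive values are used as the benchmark in the streamflow evaluation.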

4.1. Introduction of the Major Outputs of the DRIVE Model and the Real-Time GFMS

The DRIVE model can calculate a large number of hydrologic variables (e.g., soil moisture, evapotranspiration,
snow water equivalent), but the main focus in this paper is on the two outputs from the routing model
related directly to floods: (1) streamflow (or discharge, m³/s); and (2) routed runoff (or surface water
storage), which is the water depth (mm) at each grid cell on a dry ground basis, together with the statistical
thresholds used for defining flood occurrence and intensity. According to Wu et al. [2012a], each grid cell is
determined to be flooding at a time step when the routed runoff is greater than the flood threshold of that grid
cell. In this study, we calculated the flood threshold at each grid cell, based on the 11 year (2001-2011)
DRIVE model retrospective simulation results, using the method from Wu et al. [2012a] with a slight
modification to make it more reliable and easier to implement. Specifically, a grid cell is determined to
be flooding when R > P95 + 0.5 * S and Q > 10, where R is the routed runoff (mm) of that grid cell at a time
step; P95 and S are the 95th percentile value and the temporal standard deviation of the routed runoff
derived from the retrospective simulation time series at the grid cell; and Q is the corresponding value of
discharge (m³/s).
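
The flood criterion above can be sketched for a single grid cell as follows. The text does not specify the percentile interpolation rule or whether a population or sample standard deviation is used, so the linear-interpolation percentile and `statistics.pstdev` below are assumptions, as are the function names. The excess of routed runoff over the threshold is returned as the depth above threshold.

```python
import statistics


def percentile(values, pct):
    """Percentile with linear interpolation between order statistics (assumed rule)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * pct / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def flood_threshold(routed_runoff_series):
    """Threshold P95 + 0.5 * S from a retrospective routed runoff series (mm)."""
    p95 = percentile(routed_runoff_series, 95.0)
    s = statistics.pstdev(routed_runoff_series)  # population form is an assumption
    return p95 + 0.5 * s


def flood_depth_above_threshold(routed_runoff_mm, discharge_m3s, threshold_mm,
                                min_discharge_m3s=10.0):
    """Return the depth above threshold (mm) if both flood conditions hold, else 0."""
    if routed_runoff_mm > threshold_mm and discharge_m3s > min_discharge_m3s:
        return routed_runoff_mm - threshold_mm
    return 0.0


if __name__ == "__main__":
    history = [float(v) for v in range(101)]  # synthetic retrospective series (mm)
    threshold = flood_threshold(history)
    print(threshold)
    print(flood_depth_above_threshold(120.0, 50.0, threshold))
```

The discharge condition keeps very small streams from being flagged even when their routed runoff exceeds the local threshold.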

By applying the flood threshold map to (i.e., subtracting it from) the DRIVE model simulated routed runoff,
flood detection and intensity (i.e., the water depth above flood threshold (mm)) are estimated for each grid
cell of the globe at each time step. The real-time model results and precipitation background information
can be accessed at http://flood.umd.edu. Examples (screenshots) of the real-time GFMS major outputs
(routed runoff, streamflow, and flood detection/intensity) are shown in Figures 2a-2c. An example of global
TMPA 3B42 real-time rainfall input data (quarter degree) at the same time interval is also shown in Figure 2d.
For the flood detection/intensity parameter (depth above threshold), Figure 2c shows the evolution (at a
daily interval) of the flood event in North India (north subbasins of Ganges River Basin) during 15 June 2013
to 20 June 2013. To interpret the flood detection and intensity results (Figure 2c), areas with more than ~30
mm above the threshold (starting with blue) are usually considered to have significant flooding, while other
potential areas (i.e., Figure 2c, green and light blue) with lower flood intensity indicate a possible developing
flood. A wide-spread lower flood intensity usually occurs as a response to wide-spread rainfall events,
often indicating a coming flood wave in downstream areas at a later time, which can serve as a warning signal.
The North India floods were reported as killing more than 1000 people. The GFMS generally captured
the events but the accuracy was not validated because of the lack of observed data in real time for this
case.

4.2. Recent Floods in Mississippi Upstream Subbasin Rivers 

Upstream subbasins of the Mississippi River in Iowa, Illinois, Missouri, Indiana, Ohio, and Kentucky flooded
during April-June 2013 (Figures 3 and 4), with the location indicated in Figure 2 as a red rectangle over the 
USA. The GFMS output successfully captured the occurrence of these events according to information from 
the Dartmouth Flood Observatory and the media (see flooding at Des Plaines, IL on 19 April 2013 in photo- 
graph in Figure 4). Figures 3a and 3b show the snapshots of the GFMS estimated flood detection and inten- 
sity parameter for the two major flood waves from Mississippi upstream tributary rivers originating in mid- 
April and early-June 2013, respectively. Both flood events were caused by wide-spread precipitation in this 
area as shown in Figures 3c and 3d with previous 7 day accumulated precipitation prior to the flooding 
time (i.e., 09Z18Apr2013 and 09Z02Jun2013, respectively). Meanwhile, the spatially distributed streamflow 
information is also shown in Figures 3e and 3f. All such information and more details are available from the 
GFMS website, e.g., animations for detailed (3 hourly time step) flood evolution within river basin drainage 
systems and time series data for any grid cell of interest. 

Figure 2. Example of the DRIVE model major outputs from the real-time GFMS with screenshots from
http://flood.umd.edu. The examples show the model global outputs of (a) routed runoff, (b) streamflow, and
(c) flood detection and intensity (water depth (mm) above flood threshold) at a 3 h time interval
(15Z01Jul2013). An example of global TMPA 3B42V7 real-time rainfall input data at the same time interval is
shown in Figure 2d. The example also shows the spatial-temporal evolution (at daily interval) of the flood
event that occurred in North India during 15 June 2013 to 20 June 2013 (c1-c6).

In order to quantitatively validate the real-time GFMS performance in simulating these flood events, we
compared the real-time calculations with 29 USGS streamflow gauges from the USGS WaterWatch program
(http://waterwatch.usgs.gov; Figure 4a, filled circles) within the flood-affected area (along the Iowa, Cedar,
Wabash, Illinois, Ohio, Missouri, and Mississippi Rivers). The upstream drainage areas of these gauges range
from 2884 to 1,772,548 km². According to the metrics calculated for the 2 year retrospective period
(6 December 2011 to 6 December 2013), 12 of the 29 gauges (41%) showed positive daily
Nash-Sutcliffe coefficient (NSC) [Nash and Sutcliffe, 1970] values with a mean of 0.23, as indicated by green
points (rather than black) in Figure 4a, and 16 (55%) showed positive monthly NSC values with a
mean of 0.35. All these gauges showed fairly good correlation coefficients between observed and simulated
streamflow, with means of 0.55 and 0.70 at daily and monthly scales, respectively. Figure 4 also shows the
observed and simulated daily hydrographs for four of the gauges (locations indicated in Figure 4a) during
this spring and early summer flooding period (1 April to 9 June 2013). These hydrographs explain the good
performance of the GFMS in flood occurrence detection (section 4.3), as the system can generally capture
the variation and magnitude of observed streamflow during the flooding season. There were biases in
magnitude and shifts in timing as shown, but they have limited impact on flood event detection. For these
cases, the simulated floods tend to be faster than observed, which may be because the DRIVE model does
not include floodplain and lake/reservoir processes. Hydrographic parameterization can also contribute to
the timing error, e.g., overestimated channel width or underestimated surface roughness can lead to
faster flood waves. One can also see from Figure 4 that in these cases the model consistently underestimated
the snowmelt-related streamflow in early spring, which, however, is not typical for most years in our
long-term retrospective simulation (not shown).

Figure 3. Snapshots from the real-time GFMS (online: http://flood.umd.edu) for the two major flood waves,
covering April to early June 2013, in subbasin rivers upstream of the Mississippi River, including (a and b)
the flood detection and intensity (water depth above flood threshold), (c and d) previous 7 day accumulated
precipitation according to TMPA RT, and (e and f) streamflow. All data are at 1/8th (~12 km) resolution.

Overall, without model calibration and considering the impacts from man-made structures and regulated
flow (many small dams in this area, Figure 4a), the DRIVE model using the real-time satellite precipitation
input gives a reasonable real-time detection of flood occurrence and magnitude estimation.

4.3. Flood Event Inventory-Based Evaluation 

Following the same methodology developed and used by Wu et al. [2012a], a similar evaluation of the new
GFMS performance in flood event detection across the globe was conducted using the same reported flood
event databases compiled mainly from news, reports, and some satellite observations by the DFO. The flood
event database used by Wu et al. [2012a] was extended through 2011 using the latest DFO database.

Figure 4. (a) The DRIVE-RT simulated streamflow against observed data from 29 USGS gauges on the rivers of
the upper Mississippi river basin for a 2 year retrospective period (6 December 2011 to 6 December 2013).
All USGS gauges are shown as filled circles, colored green where the model gave a positive daily NSC at the
corresponding location. (b-e) The observed and simulated daily hydrographs for four of the gauges, with
locations indicated in Figure 4a, during the spring and early summer flooding period (1 April to 9 June 2013).

Based on a 2 × 2 contingency table (a = GFMS yes, reported yes; b = GFMS yes, reported no; c = GFMS no,
reported yes; d = GFMS no, reported no), three categorical verification metrics, including probability of
detection [POD; a/(a + c)], false alarm ratio [FAR; b/(a + b)], and critical success index [CSI; a/(a + b + c)],
were calculated using the 11 year (2001-2011) retrospective simulations from both DRIVE-RP and DRIVE-RT,
against the DFO flood inventory for the same time period.
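
As a minimal sketch, the three metrics can be computed directly from the contingency counts defined above. The function name and the example counts below are hypothetical and are not taken from the evaluation; the count d (GFMS no, reported no) is not needed by any of the three formulas.

```python
import math


def categorical_metrics(a, b, c):
    """Return (POD, FAR, CSI) from contingency-table counts.

    a: GFMS yes, reported yes (hits)
    b: GFMS yes, reported no  (false alarms)
    c: GFMS no,  reported yes (misses)
    A metric whose denominator is zero is returned as NaN.
    """
    pod = a / (a + c) if (a + c) > 0 else math.nan
    far = b / (a + b) if (a + b) > 0 else math.nan
    csi = a / (a + b + c) if (a + b + c) > 0 else math.nan
    return pod, far, csi


if __name__ == "__main__":
    # Hypothetical counts for a single well-reported area.
    pod, far, csi = categorical_metrics(a=9, b=11, c=1)
    print(f"POD={pod:.2f} FAR={far:.2f} CSI={csi:.2f}")
```

Because CSI penalizes both misses and false alarms, a system can have a high POD and still a low CSI when b is large, which is the pattern reported for the shorter-duration floods.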

4.3.1. Flood Threshold Maps by DRIVE-RP and DRIVE-RT and the Corresponding Background 
Precipitation Estimation 

The flood threshold maps used for the flood detection/intensity parameter are derived from the retrospective
runs and the formulas given in a previous section. The DRIVE-RP and DRIVE-RT-based flood threshold maps
have very similar spatial patterns and value ranges. The global flood threshold values from DRIVE-RP range
from 0 to 14,349 mm with a mean of 17.7 mm, while the DRIVE-RT derived threshold values range from 0 to
16,268 mm with a mean of 18.7 mm. Both flood threshold maps correspond well to the river basin drainage
networks, with large values for river grid cells having large upstream drainage areas. Figure 5a shows the
DRIVE-RT-based flood threshold map, with the difference between the thresholds for DRIVE-RP and DRIVE-RT
shown in Figure 5b. Figure 6a shows the mean annual precipitation distribution from TMPA RT for the same
time period (2001-2011), and Figure 6b shows the corresponding difference map, in parallel to Figure 5.
There is a correlation coefficient of 0.98 between the two flood threshold maps, while the correlation
coefficient of the two mean annual precipitation maps from TMPA RP and RT is also very high at 0.95. The
global mean difference between the two flood threshold maps (DRIVE-RT minus DRIVE-RP) is 1.0 mm (5.9%),
while the mean difference in the mean annual precipitation is 49.1 mm (5.4%). A visual comparison of
Figures 5b and 6b clearly shows that the variations in the flood threshold values in DRIVE-RT (relative to
DRIVE-RP) are primarily controlled by the bias distribution in the precipitation. The DRIVE-RT flood
thresholds usually show a consistent bias against those of DRIVE-RP, either low or high, within a basin or
subbasin (Figure 5b). For example, from Figures 5b and 6b, the DRIVE-RT flood threshold values and
corresponding precipitation are generally consistently higher than those of DRIVE-RP in the west-central
United States (including the entire Missouri River basin and Colorado River basin). In contrast, they are
generally lower in the eastern areas of the Mississippi River, with the result that flood threshold values
are higher than for DRIVE-RP in the downstream part of the Mississippi stem river (as seen from the inset
window in Figure 5b). A similar situation occurs in the Amazon river basin, while consistently higher
threshold values from DRIVE-RT than from DRIVE-RP were found in almost all Asian and Australian river
basins, except for Southeast Asia and coastal areas. The entire Congo River basin and almost the entire
Danube River basin and Nile River basin show lower DRIVE-RT thresholds than DRIVE-RP. A zoomed-in area of
Figure 5b for Asia is also shown as the background in Figure 7.

Figure 5. (a) Flood threshold map (according to routed runoff (mm)) based on the 11 year (2001-2011)
retrospective simulation by DRIVE-RT. (b) The difference between the flood threshold maps derived by
DRIVE-RT and DRIVE-RP (DRIVE-RT minus DRIVE-RP).

Figure 6. (a) Mean annual precipitation map according to TMPA RT from 2001 to 2011; (b) the difference
between the mean annual precipitation for TMPA RP and RT over the same period.

Figure 6b also indicates the areas where improvements are needed for satellite-based real-time land
precipitation estimation. The overestimation in the interiors of continents at higher latitudes may be
related to false identification of surface effects as precipitation events in wintertime, while
overestimation over the upper reaches of the Amazon may be related to overestimation of deep convective
events. In coastal areas in middle latitudes the underestimation is most likely related to underestimation
of shallow, orographic rainfall. Elimination of these precipitation biases will likely improve the flood
statistics.


4.3.2. Flood Event Detection Metrics 

We used the same method developed by Wu et al. [2012a] to match the simulated and reported flood
events for the evaluation. A brief introduction of the method is given below. For more details, one can refer
to Wu et al. [2012a]. The DFO flood database provides the locations (latitudes/longitudes) and days of the
reported floods. We assume the reported flood locations are located in the correct river basin, even though
they may not be recorded with precisely correct latitude and longitude coordinates. A simulated flood event
was defined within a local spatial window according to the reported location and a 1 day (±24 h) buffer
surrounding the reported flood duration. The local spatial domain was defined, based on the DRT flow
direction map, to be composed of all grid cells in the upstream drainage area within a limited flow distance
(i.e., ~200 km) according to the reported location and the grid cells in the downstream stem river of the
basin/subbasin below the reported location within a limited distance (i.e., ~100 km). When there are more
than three grid cells flooding (according to the method in section 4.1) within the spatial domain for two
continuous 3 h time intervals, we mark the entire area defined by the spatial domain as simulated flooding.

Figure 7. Example of well-reported areas (shaded yellow) and their corresponding FAR metrics (according to
DRIVE-RT for all floods with duration greater than 1 day) in the part of Asia that tends to have more floods.
The background image is the zoomed-in flood threshold difference (DRIVE-RT minus DRIVE-RP) from Figure 5b.
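
The temporal part of this matching rule can be illustrated with a short Python sketch. It assumes that the number of flooded grid cells inside the spatial domain has already been counted for each 3 h step of the event window (reported duration plus the ±24 h buffer); the function name and the input layout are hypothetical, and the construction of the spatial domain from the DRT flow direction map is not shown.

```python
def is_simulated_flood(flooded_cell_counts, min_cells=4, min_consecutive=2):
    """Apply the 'more than three cells for two continuous 3 h intervals' rule.

    flooded_cell_counts: sequence with one entry per 3 h step, giving the
        number of grid cells in the spatial domain that are above their
        flood threshold at that step (hypothetical precomputed input).
    min_cells: 4, i.e., "more than three" flooded grid cells.
    min_consecutive: 2 consecutive 3 h steps.
    """
    run_length = 0
    for count in flooded_cell_counts:
        # Extend the current run of qualifying steps, or reset it.
        run_length = run_length + 1 if count >= min_cells else 0
        if run_length >= min_consecutive:
            return True
    return False


if __name__ == "__main__":
    print(is_simulated_flood([0, 5, 2, 6, 7, 1]))  # two qualifying steps in a row
    print(is_simulated_flood([4, 0, 4, 0]))        # qualifying steps never adjacent
```

A reported event with a True result would count toward a in the contingency table, and a False result toward c.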

According to the flood event matching method discussed above, DRIVE-RP and DRIVE-RT detected 1820
(87.2%) and 1799 (86.2%) of the 2086 flood events reported by the DFO over the entire study domain during
the 11 year time period, respectively. DRIVE-RP performs only slightly better than DRIVE-RT in
detecting reported floods longer than 1 day, but both have a much higher POD than that of
the previous version of the GFMS (~60%) [Wu et al., 2012a]. The POD for flood events of greater than 3 day
duration is ~90%, as compared to ~80% for the previous system.

In order to evaluate the GFMS performance in terms of false alarms, 38 well-reported areas (Figure 7,
shaded yellow) were selected to further evaluate the flood detection performance using POD, FAR, and CSI
together. This approach is used to minimize the impact of unreported floods, especially in sparsely
populated areas. Each of these well-reported areas, following Wu et al. [2012a], is defined as a limited
spatial window (based on reported flooding location) having at least six reported floods during the 11
years. Figure 7 shows, as an example, the distribution of these well-reported areas in Southeast Asia, which
is very similar to that identified using the reported flood inventory for a different time period
(1998-2010) by Wu et al. [2012a]. Well-reported areas are also defined for the other continents. The metrics
of POD, FAR, and CSI vary across regions but with a generally consistent trend related to the number of
upstream dams. The dams (Figure 7) are located according to the global large dam database [Vorosmarty et
al., 1997, 2003]. Figure 8 shows the statistical results for each well-reported area for floods longer than
3 days according to the DFO data. There are a total of 304 floods in this validation set. Along the bottom
of the plots in Figure 8 is the number of dams (from the more comprehensive Global Reservoir and Dam (GRanD)
database) [Lehner et al., 2011] in each area, increasing toward the right side of the diagrams. Both
DRIVE-RP and DRIVE-RT results show that the FAR tends to increase with the number of dams in the upstream
areas (Figure 8). This trend is also clearly shown in Figure 7, in which FAR tends to be smaller where there
are fewer or no large dams (dots) upstream of a well-reported area. The POD score tends to be higher in
well-dammed and well-reported areas, though the signal is not as consistent as for FAR. These findings are
consistent with, and explained in detail in, Wu et al. [2012a]. Dams tend to result in more false alarms
since the DRIVE model does not include dam/reservoir operation information at this time.

The comparison between DRIVE-RP and DRIVE-RT results shows very close performance for most of the
selected well-reported areas, indicating very similar precipitation information (in terms of occurrence and
relative magnitude) in the upstream basins of these well-reported areas from TMPA RP and TMPA RT. Generally,
DRIVE-RP showed somewhat better performance than DRIVE-RT according to all metrics. DRIVE-RP provided
a slightly better overall mean POD of 0.93, FAR of 0.84, and CSI of 0.15 for all floods with duration greater
than 1 day, compared to DRIVE-RT with a mean POD of 0.90, FAR of 0.88, and CSI of 0.12 (Table 1). For
floods with longer duration (i.e., >3 days), both DRIVE-RT and DRIVE-RP significantly decreased false alarms,
with a mean FAR of 0.73 and 0.65, resulting in higher CSI scores of 0.25 and 0.34, respectively (Table 2).
Both DRIVE-RP and DRIVE-RT showed much better flood detection performance than the previous version of
GFMS, which showed a mean POD of 0.70, FAR of 0.93, and CSI of 0.07 for floods with duration more than 1
day, and a mean POD of 0.78, FAR of 0.74, and CSI of 0.23 for floods with duration more than three days
[Wu et al., 2012a]. From Tables 1 and 2, the false alarm ratios are significantly lower in WRAs with fewer
dams than in those with more dams. For floods longer than three days in the 18 WRAs with fewer than five
dams, DRIVE-RP also showed a better overall mean POD of 0.92, FAR of 0.56, and CSI of 0.43 than
DRIVE-RT, with a mean POD of 0.87, FAR of 0.66, and CSI of 0.32 (Table 2). The primary reason for improved
detection results in the new system is surmised to be the improved runoff generation and routing with the
DRIVE system, with a secondary factor possibly being improved precipitation estimation.


Figure 8. The flood detection metrics (a) POD, (b) FAR, and (c) CSI across 38 well-reported areas for
DRIVE-RP and DRIVE-RT results for all floods with duration greater than three days, against DFO flood
inventory data during 2001-2011. The numbers of dams upstream of each well-reported area are listed along
the X axis.

WU ET AL. ©2014. American Geophysical Union. All Rights Reserved.
AGU Water Resources Research 10.1002/2013WR014710

4.4. Gauge Streamflow-Based Validation

Streamflow is arguably the best variable for evaluating the overall performance of a hydrologic
model because it represents the integrated results of all upstream water and energy processes, and
streamflow observations are much more widely available than other hydrologic variables (e.g., soil moisture,
surface runoff), with relatively lower bias in observations. We evaluated the DRIVE model performance for
streamflow simulation using observed streamflow data from 1121 global river gauges from the GRDC database.
The gauges were selected with the following criteria: (1) gauge data have at least a 1 year length of daily
time series during the validation period 2001-2011; (2) the gauge can be well located in the DRT upscaled
river network, which serves as the geo-mask for organizing all model input and output data, so that the
gauge observations can accurately represent the runoff concentration from its upstream drainage area; (3)
the gauge upstream drainage area is >200 km²; (4) the gauge is not close to the study domain boundaries
(latitude 50°N and 50°S), since such gauges cannot accurately represent their full upstream drainage basins.
A program from the DRT algorithm package was used to geo-locate the original GRDC gauges in the model
domain for evaluation. For each selected gauge, the difference in upstream drainage area of the gauge
location between the DRT data set and the GRDC data set is less than 10%. The selected river gauges are
widely distributed across the study domain and provide a good representation of the diverse hydroclimate
regions, e.g., arid, semiarid, and humid regions (Figure 9). However, east Africa and south and west Asia
(particularly the area between 46°E and 97°E) are somewhat underrepresented in this evaluation.
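
The selection criteria above could be applied as a simple filter, sketched below in Python. The `Gauge` record, its field names, and the example values are hypothetical. Two choices are assumptions rather than statements from the text: "at least a 1 year length" is read as at least 365 daily values, and the margin used for "not close to the study domain boundaries" is a placeholder because no numeric value is given. Criterion (2) is represented here only through the <10% drainage area agreement between the DRT and GRDC data sets.

```python
from dataclasses import dataclass

MIN_DAILY_RECORDS = 365      # assumed reading of "at least a 1 year length"
MIN_AREA_KM2 = 200.0         # criterion (3): upstream drainage area > 200 km2
MAX_AREA_MISMATCH = 0.10     # DRT vs. GRDC drainage area differ by < 10%
DOMAIN_LAT_DEG = 50.0        # study domain spans 50S to 50N
BOUNDARY_MARGIN_DEG = 1.0    # HYPOTHETICAL: the text gives no numeric margin


@dataclass
class Gauge:
    """Hypothetical gauge record; field names are illustrative only."""
    name: str
    lat_deg: float
    n_daily_records: int
    area_grdc_km2: float
    area_drt_km2: float


def is_selected(g: Gauge) -> bool:
    """Return True if the gauge passes all four (approximated) criteria."""
    if g.n_daily_records < MIN_DAILY_RECORDS:
        return False
    if g.area_grdc_km2 <= MIN_AREA_KM2:
        return False
    mismatch = abs(g.area_drt_km2 - g.area_grdc_km2) / g.area_grdc_km2
    if mismatch >= MAX_AREA_MISMATCH:
        return False
    return abs(g.lat_deg) <= DOMAIN_LAT_DEG - BOUNDARY_MARGIN_DEG


if __name__ == "__main__":
    candidates = [
        Gauge("ok", 10.0, 4000, 5000.0, 5200.0),
        Gauge("too_small", 10.0, 4000, 150.0, 150.0),
        Gauge("near_boundary", 49.8, 4000, 5000.0, 5000.0),
    ]
    print([g.name for g in candidates if is_selected(g)])
```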

Both DRIVE-RP and DRIVE-RT results for the same retrospective time period from January 2001 to December
2011 (132 months) were compared to observed daily streamflow data. The metrics include daily (Nd) and
monthly (Nm) Nash-Sutcliffe coefficient (NSC) values, daily (Rd) and monthly (Rm) correlation coefficients,
and mean annual relative error (MARE), all calculated from the simulated and observed time series of
streamflow (m³/s).
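
For reference, the sketch below shows how such metrics are commonly computed from paired observed and simulated series. The NSC and the Pearson correlation follow their standard definitions. The exact MARE formula is not given in this section, so `relative_error_percent` is an assumption that takes it as the relative difference between total simulated and total observed flow; all function names are hypothetical.

```python
import math


def nash_sutcliffe(obs, sim):
    """Standard Nash-Sutcliffe coefficient: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)


def relative_error_percent(obs, sim):
    """ASSUMED form of a relative water-balance error, in percent."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]    # hypothetical streamflow values (m3/s)
    simulated = [1.5, 2.5, 2.5, 4.5]
    print(nash_sutcliffe(observed, simulated))
    print(pearson_r(observed, simulated))
    print(relative_error_percent(observed, simulated))
```

An NSC of 1 indicates a perfect match, and an NSC of 0 means the simulation is no better than the observed mean, which is why positive NSC is used as the skill cut-off in the tables. Monthly values (Nm, Rm) would be obtained by applying the same functions to monthly means of both series.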

4.4.1. Overall Model Performance in 
Streamflow Simulation Over the Globe 

Overall, when compared against the observed daily streamflow data from 1121 GRDC gauges, DRIVE-RP showed
that 60% (675) of the gauges had positive monthly NSC with a mean of 0.39, and 29% (322) of gauges had
monthly NSC greater than 0.4 with a mean of 0.57 (Table 3). Meanwhile, 38% (424) of gauges had MARE within
30%, with a mean of -0.3%. Good correlation between the model-simulated and observed streamflow time series
at the monthly scale exists at almost all the gauges, with a mean correlation of 0.67. Figure 9 shows the
spatial distribution of the monthly NSC for the DRIVE-RP streamflow simulation results and indicates that
the model has a generally consistent performance across different regions. Figure 10 shows the histogram
distribution of the number of gauges with positive monthly and daily NSC metrics for DRIVE-RP and DRIVE-RT,
which clearly indicates that DRIVE-RP outperforms DRIVE-RT at the monthly scale, while the difference in
performance between DRIVE-RP and DRIVE-RT is smaller at the daily scale.

Model performance decreased, as expected, at the daily scale, e.g., 46% of the gauges with positive monthly
NSC had negative daily NSC. However, 58% (655) of gauges had correlation coefficients greater than 0.4
between the model-simulated and observed streamflow at the daily scale, with a mean of 0.57. The correlation
is more important for flood event detection, in which the percentile-based skill mainly depends on the
relative order of routed runoff (or streamflow) magnitudes [Wu et al., 2012a]. The decrease in model skill at
the daily scale is attributed to a combination of the precipitation input, model parameterization, and
human impacts. The TMPA RP precipitation contains an adjustment using available rain gauge data at the
monthly scale, which does not provide significant positive impact on the submonthly variability of
precipitation because the submonthly variability depends on the sequence of short-interval precipitation
events from the satellites. The model parameters (e.g., surface roughness) tend to lead to larger time lag
bias at smaller time scales, e.g., a flood wave simulated too fast will have a much more negative impact on
daily evaluation metrics than on monthly ones. Human impacts (particularly the effect of dam regulation) can
significantly change the shape of the daily hydrograph of a natural river, while having less impact at
seasonal scales. According to the global metrics (Table 3 and Figure 9), the DRIVE model including only
natural processes, driven by TMPA-RP precipitation and a priori parameter sets, shows an overall promising
performance in reproducing streamflow for global rivers.


Table 1. Flood Detection Verification Against the DFO Flood Database Over the 38 Well-Reported Areas (WRAs)
for Floods With Duration More Than 1 Day

Metrics        POD     FAR     CSI

Metrics Averaged Over All the 38 WRAs
DRIVE-RT       0.90    0.88    0.12
DRIVE-RP       0.93    0.84    0.15

Metrics Averaged Over the 20 WRAs With ≥5 Dams
DRIVE-RT       0.93    0.92    0.08
DRIVE-RP       0.94    0.90    0.10

Metrics Averaged Over the 18 WRAs With <5 Dams
DRIVE-RT       0.86    0.83    0.17
DRIVE-RP       0.92    0.78    0.21


The generally good performance of DRIVE-RP can also provide a measure for evaluating the potential of the
real-time GFMS performance when using TMPA-RT precipitation input. From Table 3, DRIVE-RT has a
generally consistently lower skill than DRIVE-RP, as expected, with lower NSCs and correlation coefficients
at both daily and monthly scales and larger MARE. However, there were 215 gauges
(19%) with positive daily NSC with a mean of 0.16 and 474 gauges (42%) with good correlations (>0.4)
between simulated and observed daily streamflow with a mean of 0.53. These types of variations in flood
statistics that are a function of rainfall input indicate that improvement of the satellite precipitation
information will lead directly to better flood determinations.

Table 2. The Same as Table 1, But for Floods With Duration More Than 3 Days

Metrics        POD     FAR     CSI

Metrics Averaged Over All the 38 WRAs
DRIVE-RT       0.90    0.73    0.25
DRIVE-RP       0.93    0.65    0.34

Metrics Averaged Over the 20 WRAs With ≥5 Dams
DRIVE-RT       0.93    0.80    0.19
DRIVE-RP       0.94    0.73    0.26

Metrics Averaged Over the 18 WRAs With <5 Dams
DRIVE-RT       0.87    0.66    0.32
DRIVE-RP       0.92    0.56    0.43

4.4.2. Seasonal and Regional Model 
Performance in Streamflow Simulation 

In order to further evaluate the variations of model performance in streamflow simulation, the same metrics
as presented in section 4.4.1 are derived from the model results and observed data for different regions
and seasons (Tables 3-5).

Figure 9. DRIVE-RP model performance (monthly NSC) in reproducing monthly streamflow during 2001-2011, when
driven by TMPA RP research precipitation data, at 1121 GRDC streamflow gauges across the globe. All GRDC
gauges are shown as filled circles; where the monthly NSC at a gauge is positive, the gauge is colored green
or purple according to the NSC value. For clarity, six subregions (Figures 9a-9f) are enlarged.


Table 3 also shows the metrics calculated based on the full simulation time series (indicating the overall
model performance) for several latitude bands, i.e., deep tropics (10°S-10°N), subtropics (10°N-30°N and
10°S-30°S), and midlatitudes (30°N-50°N and 30°S-50°S). To facilitate interpretation of Table 3, the
percentage of gauges for which the DRIVE model showed positive daily NSCs is plotted for each latitude band
in Figure 11. DRIVE-RT clearly shows model skill decaying from the deep tropics toward higher latitudes in
both hemispheres, probably in response to the TMPA RT precipitation quality. Similar decays occurred for
other metrics, e.g., for DRIVE-RT results, 57% of stations have positive monthly NSC with a mean Nm of 0.36
in the deep tropics, dropping to 51% of gauges with a mean Nm of 0.33 for the northern subtropics and 25% of
gauges with a mean Nm of 0.21 for the northern midlatitudes (Table 3). DRIVE-RP showed generally
consistently better model performance over all these regions than DRIVE-RT, and similar model skill decay
toward higher latitudes can also be seen in the DRIVE-RP results in Table 3 and Figure 11. Interestingly,
this decay pattern was modified slightly (Figure 11) by the monthly gauge-based correction in the TMPA RP,
which leads to relatively better monthly scale performance in higher latitudes where more rain gauge data
are available. For the northern midlatitudes, 66% of gauges had positive Nm with a mean of 0.38 with
DRIVE-RP, while for the northern subtropics 54% (23 out of 43) of gauges had positive Nm with a mean of 0.41.

Table 3. The Metrics for Model Performance in Streamflow Simulation, at Daily and Monthly Time Intervals for
Continuous Years, Against 1121 GRDC River Gauges Across the Globe (50°S to 50°N) (a)

                       Daily NSC           Monthly NSC  Correlation Coefficients
                      Nd > 0  Nd > 0.4    Nm > 0  Nm > 0.4  Rd > 0.4  Rm > 0.4 MARE <30%

Global (50°S to 50°N) With 1121 Gauges
% of gauges   RP          32         4        60        29        58        99        38
              RT          19         7        32         7        42        95        27
Mean metrics  RP        0.22      0.52      0.39      0.57      0.57      0.67     -0.3%
              RT        0.16      0.57      0.27      0.54      0.53      0.53     -2.9%

10°S-10°N With 141 Gauges
% of gauges   RP          44         9        62        31        76        99        44
              RT          39         6        57        22        75        98        57
Mean metrics  RP        0.25      0.55      0.41      0.58      0.64      0.70     -6.8%
              RT        0.23      0.60      0.36      0.58      0.61      0.66     -5.5%

10°N-30°N With 43 Gauges
% of gauges   RP          30         5        54        28        51        95        37
              RT          23         2        51        79        42        95        33
Mean metrics  RP        0.17      0.47      0.41      0.59      0.58      0.72     -0.3%
              RT        0.18      0.54      0.33      0.60      0.54      0.60     -0.6%

30°N-50°N With 671 Gauges
% of gauges   RP          34         4        66        31        61        99        41
              RT          77         1        25         3        39        96        24
Mean metrics  RP        0.21      0.52      0.38      0.56      0.56      0.66      1.1%
              RT        0.13      0.53      0.21      0.50      0.51      0.45     -7.2%

10°S-30°S With 191 Gauges
% of gauges   RP          28         1        52        28        59        99        34
              RT          22         0        45        77        46        98        35
Mean metrics  RP        0.17      0.46      0.30      0.56      0.54      0.46      2.0%
              RT        0.11                0.29      0.50      0.52      0.56     -4.9%

30°S-50°S With 75 Gauges
% of gauges   RP          21         0        44         8         5        96        20
              RT          10         0        24         0         7        88         9
Mean metrics  RP        0.05                0.25      0.46      0.52      0.57     -9.2%
              RT        0.01                0.06                0.44      0.34        6%

(a) Metrics are listed for global and regional areas (from deep tropics to higher latitudes). The time period
of daily streamflow gauge data ranges from 1 to 11 years. Nd and Nm stand for daily and monthly NSC,
respectively. Rd and Rm stand for daily and monthly correlation coefficients, respectively. MARE is the mean
annual relative error.

The same metrics were also calculated for DRIVE-RP and DRIVE-RT results for these latitude bands but based
only on summer (Table 4) and winter (Table 5) months, respectively. The metrics calculated from the full
time series and from summer-only and winter-only months (Tables 3-5) indicate the same consistent relative
model performance across different regions and between DRIVE-RP and DRIVE-RT. Seasonal metrics (Tables 4 and
5) also show generally consistently better model performance in the deep tropics and subtropics than in the
midlatitudes. Tables 4 and 5 also show generally larger water balance bias (MARE) and relatively lower
monthly correlation coefficients between observed and simulated streamflow in winter seasons than in
summer seasons, indicating relatively lower quality of satellite-based precipitation estimation for winter
seasons. Although precipitation is not the only cause of the spatial variation in model performance, it is
probably the primary one, and its signature is clearly visible in the results.

Figure 10. Histogram distribution of the number of gauges with positive (a) monthly and (b) daily NSC values
for DRIVE-RP and DRIVE-RT simulation for 2001-2011.

Table 4. The Same as Table 3 But for Summer Seasons (i.e., JJA Is Used for Deep Tropics and Northern
Hemisphere While DJF Is Used for Southern Hemisphere)

                       Daily NSC           Monthly NSC  Correlation Coefficients
                      Nd > 0  Nd > 0.4    Nm > 0  Nm > 0.4  Rd > 0.4  Rm > 0.4 MARE <30%

10°S-10°N With 141 Gauges
% of gauges   RP          14         5        31        11        51        84        33
              RT          14         4        18         8        55        86        31
Mean metrics  RP        0.32      0.68      0.32      0.59      0.65      0.64     -3.2%
              RT        0.26      0.48      0.31      0.53      0.61      0.61     -2.7%

10°N-30°N With 43 Gauges
% of gauges   RP          19         0        28        14        37        86        23
              RT          16         2        35        12        26        84        14
Mean metrics  RP        0.10                0.31      0.54      0.56      0.65      0.1%
              RT        0.16      0.43      0.30      0.52      0.56      0.62       -1%

30°N-50°N With 671 Gauges
% of gauges   RP          25         4        43        22        58        99        25
              RT          10         1        79         3        30        92        21
Mean metrics  RP        0.22      0.54      0.41      0.61      0.56      0.72      1.3%
              RT        0.16      0.53      0.25      0.57      0.52      0.48     -1.4%

10°S-30°S With 191 Gauges
% of gauges   RP          19         0        42        19        37        93      26.2
              RT          13         0        26         6        19        85        31
Mean metrics  RP        0.14                0.37      0.57      0.51      0.66     -3.5%
              RT        0.10                0.26      0.49      0.48      0.48      1.4%

30°S-50°S With 75 Gauges
% of gauges   RP           7         0        31         8         8        72        15
              RT           8         0        11         0         3        63        11
Mean metrics  RP        0.11                0.27      0.55      0.52      0.62     -3.7%
              RT        0.03                0.06                0.49      0.37      2.3%


Table 5. The Same as Table 3 But for Winter Seasons (i.e., DJF Is Used for Deep Tropics and Northern
Hemisphere While JJA Is Used for Southern Hemisphere)

                       Daily NSC           Monthly NSC  Correlation Coefficients
                      Nd > 0  Nd > 0.4    Nm > 0  Nm > 0.4  Rd > 0.4  Rm > 0.4 MARE <30%

10°S-10°N With 141 Gauges
% of gauges   RP          17         3        36        14        43        87        34
              RT          15         4        26        77        23        89        37
Mean metrics  RP        0.23      0.55      0.34      0.57      0.61      0.62     -2.8%
              RT        0.24      0.55      0.31      0.53      0.60      0.47     -5.3%

10°N-30°N With 43 Gauges
% of gauges   RP           9         0        28         9        28        75        30
              RT          14         0        26         2        27        63        28
Mean metrics  RP        0.01                0.25      0.51      0.56      0.62      1.2%
              RT        0.04                0.16      0.45      0.61      0.54     -2.3%

30°N-50°N With 671 Gauges
% of gauges   RP          22         3        34        16        48        92        39
              RT           8         7        77         3        33        78        79
Mean metrics  RP        0.02      0.12      0.40      0.62      0.55      0.61     -6.2%
              RT        0.01      0.07      0.27      0.57      0.52      0.49     -5.5%

10°S-30°S With 191 Gauges
% of gauges   RP           7         1        10         4        28        66        15
              RT           5         7         7         3        75        56        14
Mean metrics  RP        0.02       0.1      0.31      0.64      0.60      0.57      3.0%
              RT        0.01      0.08      0.23      0.48      0.52      0.44     -1.6%

30°S-50°S With 75 Gauges
% of gauges   RP          15         0        42        19         9        85        23
              RT          15         0        27         7         8        76        77
Mean metrics  RP        0.09                0.30      0.51      0.45      0.65     -8.9%
              RT        0.09                0.22      0.44      0.47      0.44     -6.7%


WU ET AL. ©2014. American Geophysical Union. All Rights Reserved.

AGU Water Resources Research 10.1002/2013WR014710



Figure 11. The percentage of gauges in each latitude band (defined in section 4.4.2) for which the DRIVE model showed positive daily NSCs using TMPA RP and TMPA RT precipitation input. The x axis values are the central latitude for each band.


Figure 12 shows an example of comparisons of model performance between DRIVE-RP and DRIVE-RT in South America (primarily in the Amazon River Basin with relatively fewer dams) according to daily NSC and MARE. One can see that the DRIVE model shows very similar statistical performance in terms of reproducing observed daily streamflow time series and annual water balance when driven by TMPA RP or RT data. For this region (Figure 12) there were 76 gauges, out of a total of 205, showing a positive daily NSC with a mean of 0.25 by the DRIVE-RP, while the DRIVE-RT derived 63 gauges with positive NSC with a mean of 0.22. There were 101 and 112 gauges with MARE <30% with means of -2.3% and -5.4% by DRIVE-RP and DRIVE-RT, respectively. This indicated a generally good real-time GFMS performance (relative to DRIVE-RP) for many areas. Note that all the results were derived from the DRIVE model without any further calibration. Appropriate calibration is expected to improve the model performance for many rivers, particularly for those gauges (Figures 12c and 12d, among green and purple points) with model-calculated negative NSCs and relatively higher MARE that are still within a reasonable range of error (e.g., NSC > -1.0 and MARE within 50%). Of course, precipitation error reduction is probably even more important.




Figure 12. The daily (a and b) NSC and (c and d) MARE metrics for the region of South America from (a and c) DRIVE-RP and (b and d) DRIVE-RT model results.

Figure 13. Examples of the simulated and observed hydrographs at two gauges. The gauge locations are indicated as filled circles in Figure 6b.


4.4.3. Examples of Simulated Hydrographs Against Observations 

Two GRDC gauges (locations indicated as dark points in Figure 6b) were selected as examples to show the simulated streamflow time series against observed hydrographs with monthly and daily intervals (Figure 13). They were selected because they represent relatively natural river basins without dams and both DRIVE-RP and DRIVE-RT results show reasonable positive monthly and daily NSCs. The GRDC gauge 1577101 (8.38333°N, 38.78333°E) is on the Awash River, Ethiopia, with a mean annual precipitation of 1102 mm (according to TMPA RP observations from 1998 to 2012) for its upstream basin area of 7656 km² (represented by the DRT with 40 1/8th degree grid cells). The gauge 3664100 (25.77389°S, 52.93287°W) is on the Rio Chopim, Brazil, with a mean annual precipitation of 2102 mm for its upstream drainage area of 6756 km² (44 grid cells). Figure 13 shows that the simulated hydrographs generally agree well with the observed hydrographs at both daily and monthly scales. DRIVE-RT results show systematically lower streamflow estimates than DRIVE-RP over the time period (2001-2009) at the Ethiopian gauge. However, at the Brazilian gauge, the DRIVE-RT and DRIVE-RP show very close results, while the DRIVE-RT estimated streamflow is overall slightly higher than that of DRIVE-RP. The streamflow biases (DRIVE-RT versus DRIVE-RP) at both gauges are consistent with the precipitation bias (TMPA RT versus RP, Figure 6b).

Table 6. DRIVE Model Streamflow Simulation Performance at Two Selected Gauges^a

                           N_d     N_m     R_d     MARE

GRDC 1577101 (2001-2009)
  DRIVE-RP                 0.35    0.67    0.62    5.4%
  DRIVE-RP (-1 day)        0.35    0.67    0.63    5.4%
  DRIVE-RP (n = 0.035)     0.45    0.68    0.67    5.7%
  DRIVE-RT                 0.29    0.40    0.60    -41%
  DRIVE-RT (-1 day)        0.30    0.41    0.61    -41%

GRDC 3664100 (2002-2005)
  DRIVE-RP                 0.28    0.65    0.55    0.5%
  DRIVE-RP (-1 day)        0.48    0.65    0.69    0.5%
  DRIVE-RP (n = 0.035)     0.55    0.64    0.75    0.6%
  DRIVE-RT                 0.17    0.59    0.53    9.6%
  DRIVE-RT (-1 day)        0.43    0.58    0.68    9.5%

^a n is the Manning roughness coefficient, which was used uniformly globally for both the dominant rivers and tributaries. The metrics were also calculated by delaying the simulated streamflow time series by 1 day, which resulted in the maximum correlation coefficient between simulated and observed hydrographs.
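
The daily and monthly NSC values reported for the two gauges can be illustrated with a short sketch. It assumes the standard Nash-Sutcliffe definition (one minus the ratio of the squared simulation error to the variance of the observations about their mean). The function names, the 30 day block averaging used as a stand-in for calendar months, and the synthetic series are all hypothetical and are not taken from the DRIVE code or the GRDC data.

```python
import numpy as np

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe coefficient, assuming the standard definition."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    residual = np.sum((obs - sim) ** 2)        # squared simulation error
    spread = np.sum((obs - obs.mean()) ** 2)   # variance of observations about their mean
    return 1.0 - residual / spread

def block_means(series, block):
    """Average consecutive blocks of `block` values (hypothetical stand-in for monthly means)."""
    arr = np.asarray(series, dtype=float)
    usable = (arr.size // block) * block       # drop an incomplete trailing block
    return arr[:usable].reshape(-1, block).mean(axis=1)

# Synthetic daily series for illustration only (not gauge data).
rng = np.random.default_rng(0)
days = np.arange(360)
obs_daily = 100 + 50 * np.sin(2 * np.pi * days / 360) + rng.normal(0, 5, days.size)
sim_daily = obs_daily + rng.normal(0, 20, days.size)

nsc_daily = nash_sutcliffe(obs_daily, sim_daily)
nsc_monthly = nash_sutcliffe(block_means(obs_daily, 30), block_means(sim_daily, 30))
print(f"daily NSC = {nsc_daily:.2f}, monthly NSC = {nsc_monthly:.2f}")
```

A simulation equal to the observations scores 1, and a simulation equal to the observed mean scores 0, so a positive NSC means the model outperforms the observed mean as a predictor.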

The time delay (in days) was calculated, based on the daily values, to evaluate the errors related to the time lag between the simulated and observed hydrographs. The time delay was calculated as the time lag where the correlation coefficient between the daily simulated and observed time series is at a maximum [De Paiva et al., 2013]. Positive (negative) time delay values indicate delayed (advanced) simulated hydrographs. A negative 1 day time delay was found at the two gauge locations for both DRIVE-RP and DRIVE-RT simulations, indicating the DRIVE model has faster flood wave simulations than observed at these two locations. Table 6 shows the model performances at the two gauges under different scenarios. A 1 day delayed simulated hydrograph also resulted in significantly improved daily NSC metrics at the Brazilian gauge for both DRIVE-RP and DRIVE-RT. At this gauge, the original DRIVE-RT derived a daily NSC of 0.17 for the time period of 2002-2005, while the 1 day time lag corrected simulated hydrograph has a daily NSC of 0.43. As expected, a 1 day time lag has minor impacts on monthly and annual metrics at both gauges. The Ethiopian gauge statistics improve only slightly with the 1 day time lag adjustment, indicating the timing error is smaller (at subdaily level) at this gauge, or that there are other effects. Simulated hydrographs that are too fast were found in many other locations. This general bias in timing may be related to the fact that a floodplain module is not included in the current version of the DRIVE model and that calibration of the channel geometry-related parameters (particularly the Manning roughness and channel width parameters) is lacking. The constant Manning roughness value of 0.03 used in this study is probably too low for many river basins. A simple increase of the Manning roughness to 0.035 resulted in significant improvements in DRIVE-RP for both gauges (Table 6). Figure 14 shows the simulated and observed daily hydrographs (at gauge 3664100 [Brazil]) for a short time window as an example indicating that the time delay error in the original DRIVE model simulation can be corrected through model calibration (here through a simple adjustment of the Manning roughness value).
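
The time delay described above, the lag at which the correlation between the daily simulated and observed series peaks, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in the study: the function name, the search window `max_lag`, and the synthetic hydrograph are hypothetical. The sign convention follows the text, so a negative lag means the simulated hydrograph is advanced.

```python
import numpy as np

def time_delay(observed, simulated, max_lag=5):
    """Return (lag, r): the lag in days that maximizes the Pearson correlation.

    Positive lag: simulated hydrograph is delayed relative to observations.
    Negative lag: simulated hydrograph is advanced (too fast).
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    best_lag, best_r = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            # Pair obs[t] with sim[t + lag]
            o, s = obs[:obs.size - lag], sim[lag:]
        else:
            # Pair obs[t - lag] with sim[t]
            o, s = obs[-lag:], sim[:sim.size + lag]
        r = np.corrcoef(o, s)[0, 1]
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

# Synthetic hydrograph with two flood peaks (illustration only).
t = np.arange(100)
obs = 50 + 40 * np.exp(-0.5 * ((t - 30) / 5) ** 2) + 20 * np.exp(-0.5 * ((t - 70) / 8) ** 2)
sim = np.roll(obs, -1)  # simulated peaks arrive one day early

lag, r = time_delay(obs, sim)
print(lag)
```

With the simulated series running one day ahead of the observations, the search returns a lag of -1, which corresponds to the "-1 day" rows of Table 6, where the simulated series is delayed by 1 day before the metrics are recomputed.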

The two examples indicate that improved calibration and better model parameterization will improve both runoff generation and runoff-routing modeling and should be a focus for the future. The major magnitude difference usually happens in the flood season, which may indicate that a seasonally oriented calibration, in addition to a floodplain module, might be required for more accurate flood magnitude estimation.
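
The roughness sensitivity test can be put into rough numbers with the following sketch. It assumes that channel velocity follows Manning's equation, v = (1/n) R^(2/3) S^(1/2); the routing equations themselves are not given in this section, so this is an assumption, and the hydraulic radius, slope, and reach length below are hypothetical values chosen only for illustration.

```python
def manning_velocity(n, hydraulic_radius_m, slope):
    """Mean channel velocity (m/s) from Manning's equation in SI units."""
    return (hydraulic_radius_m ** (2.0 / 3.0)) * (slope ** 0.5) / n

def travel_time_hours(n, reach_length_km, hydraulic_radius_m, slope):
    """Time for water to traverse a reach at the Manning velocity."""
    velocity = manning_velocity(n, hydraulic_radius_m, slope)
    return reach_length_km * 1000.0 / velocity / 3600.0

# Hypothetical reach: 200 km long, hydraulic radius 2 m, slope 0.0005.
t_base = travel_time_hours(0.03, 200.0, 2.0, 0.0005)   # roughness used in the study
t_test = travel_time_hours(0.035, 200.0, 2.0, 0.0005)  # roughness in the sensitivity test
ratio = t_test / t_base                                 # equals 0.035 / 0.03

print(f"travel time: {t_base:.1f} h -> {t_test:.1f} h (ratio {ratio:.3f})")
```

At a fixed hydraulic radius and slope, the travel time scales linearly with n, so raising n from 0.03 to 0.035 lengthens it by about 17%. In a real channel the flow depth also adjusts to the roughness, so this is only a first-order indication of why a modest roughness increase can shift simulated flood peaks later.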


Figure 14. Example of hydrographs in a short time window (11 April 2005 to 31 December 2005) computed by the DRIVE-RP. The red curve stands for the original DRIVE-RP modeling with a Manning coefficient of 0.03 for both stem river and subgrid tributaries; the black curve is from DRIVE-RP using a Manning coefficient of 0.035, while the green curve is the negative 1 day corrected original DRIVE-RP simulated hydrograph.


5. Discussion

In this study, we use a deterministic model for real-time flood monitoring. Uncertainties can lie in both the model itself and the model inputs. Many factors such as the quality of precipitation estimation, human activities (particularly through reservoir/dam regulation, irrigation withdrawal, etc.), and model structure and parameterization can significantly impact model performance. Specifically for this study, the satellite-based precipitation used here has generally good quality in the tropics, but with relatively more quality issues in higher latitudes, cool seasons, and complex terrain; the DRIVE model in its current version does not include processes for man-made structures and human flow regulation, which exist extensively over the globe; even with only natural processes represented in the model, we have not performed any calibrations to tune the model toward better reproducing observations, though the model showed strong sensitivity to some parameters (e.g., Manning roughness). However, calibration of the hydrologic model can be problematic if the observed discharge falls within the uncertainty of the simulated discharge [Biemans et al., 2009]. Calibration efforts in the future have to be implemented after an uncertainty analysis with particular attention paid to precipitation uncertainty for flood applications. Given the global domain of this study, the dominance of uncertainty sources will also be spatially dependent. Further work is needed to develop techniques or deploy existing ones from the literature [e.g., Beven and Freer, 2001; Renard et al., 2011; Demirel et al., 2013] for systematic uncertainty analysis. It is worth mentioning that the recent launch of the Global Precipitation Measurement (GPM) Core Observatory, a joint Earth-observing mission (as the follow-on of the TRMM mission) between NASA and the Japan Aerospace Exploration Agency (JAXA) [Hou et al., 2013], provides a good opportunity for further investigation of the uncertainties in our real-time flood modeling work. The DRIVE model is a participating hydrologic model in the GPM's Ground Validation (GV) Program to investigate the effects of precipitation uncertainty on model results and the uncertainty propagation in hydrologic processes by deploying various existing precipitation products (both conventional and satellite-based). We will report the results of that effort in a later paper.

Despite the aforementioned uncertainties, we think the current model setup and evaluation results provide a good basis for justifying the use of the GFMS for real-time flood monitoring, providing valuable information for flood analysis and for flood relief practice. Alfieri et al. [2013] recently performed a 21 year retrospective global hydrologic simulation driven by ERA-Interim reanalysis forcings at a 1/10th degree resolution. Their evaluation against streamflow observations at 620 GRDC gauges showed that 58% of these gauges had a positive daily NSC. In this study, we use satellite precipitation and run the hydrologic model at 1/8th degree resolution while evaluating the model performance using 1121 GRDC gauges (with more gauges with smaller upstream areas and shorter data time length). In our model performance statistics, we did not remove the gauges with upstream reservoirs as done by Alfieri et al. The validation metrics of the two studies are comparable. We also assume the uncertainties involved would not change the spatial-temporal pattern of the validation metrics derived in this study.


6. Summary and Conclusions 

An experimental real-time Global Flood Monitoring System (GFMS) using satellite-based precipitation information has been running routinely for the last few years with evaluations of previous versions [Yilmaz et al., 2010; Wu et al., 2012a] showing positive results, but indicating areas for additional improvement. In this paper, we describe a new version of the system, present examples from the real-time system, and present an evaluation using a global flood event archive and streamflow observations. Real-time results from the system can be viewed at http://flood.umd.edu. For this new version of GFMS, a widely used land surface model (LSM), the Variable Infiltration Capacity (VIC) model [Liang et al., 1994, 1996], is coupled with a newly developed hierarchical dominant river tracing-based runoff-routing (DRTR) model to form the Dominant river tracing-Routing Integrated with VIC Environment (DRIVE) model system. The DRTR routing model is a physically based routing model running on a grid system with parameterization of each routing model element (at either grid level or subgrid level) based on high resolution (1 km) hydrographic inputs through robust hierarchical DRT [Wu et al., 2011, 2012b]. The VIC model was modified, for real-time flood simulation, from its original individual grid cell-based running mode to match the DRTR routing model structure with all grid cell calculations completed at each time step.

Examples from the GFMS real-time system over North India are used to describe the flood detection/intensity algorithm and the time history of regional maps of this parameter, and to present example streamflow calculations. The validation and analysis based on the recent flood events over the upper Mississippi valley from the GFMS real-time system demonstrated that the real-time GFMS had a fairly good performance in flood occurrence detection, flood evolution, and magnitude calculation according to observed daily streamflow data.

Results of 15 year retrospective calculations with the DRIVE system using research (TMPA-RP) and real-time (TMPA-RT) precipitation data sets indicate generally positive results. Global flood detection threshold maps based on the retrospective calculation of routed runoff at each grid location indicate a high level of correlation between the two rainfall data set inputs, with global and regional biases in the threshold related closely to differences in the mean rainfall. Using either rainfall data set, the system detected about 87% of flood events of greater than 1 day duration across the globe. A further evaluation in 38 well-reported areas (to avoid under-reporting) also gave a POD of 0.90, with a false alarm ratio (FAR) of about 0.85 for flood events with duration greater than 1 day, which decreases to 0.70 for longer duration floods (greater than three days). Consistent with the findings of Wu et al. [2012a] in an evaluation of the previous version of our system, dams tended to undermine model skill in flood detection by leading to more false alarms. According to the statistics for the 18 WRAs with fewer than five dams (i.e., the most natural basins in our global comparison), the flood detection system being driven by the real-time precipitation information had a POD of 0.87, FAR of 0.66, and CSI of 0.32 for floods with duration longer than three days. Somewhat better statistics were achieved using the research-quality precipitation information. In general, the new system provides improved statistics over the previous version of the GFMS when compared to the flood event inventory. This improvement is related primarily to the improved routing model and the use of a well-tested LSM (VIC), but also to some improvement to the real-time rainfall information.
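
The relationship between the POD, FAR, and CSI values quoted above can be checked with a short sketch. It assumes the standard contingency-table definitions of the three scores, which may differ in detail from how events were matched in the evaluation. The function names and the event counts are hypothetical.

```python
def detection_scores(hits, misses, false_alarms):
    """POD, FAR and CSI from event counts (standard contingency-table definitions)."""
    pod = hits / (hits + misses)                  # probability of detection
    far = false_alarms / (hits + false_alarms)    # false alarm ratio
    csi = hits / (hits + misses + false_alarms)   # critical success index
    return pod, far, csi

def csi_from_pod_far(pod, far):
    """CSI implied by POD and FAR: 1/CSI = 1/POD + 1/(1 - FAR) - 1."""
    return 1.0 / (1.0 / pod + 1.0 / (1.0 - far) - 1.0)

# Hypothetical counts, for illustration only.
pod, far, csi = detection_scores(hits=8, misses=2, false_alarms=12)
print(f"POD = {pod:.2f}, FAR = {far:.2f}, CSI = {csi:.2f}")

# Consistency check on the values reported for the 18 WRAs with fewer than five dams.
implied_csi = csi_from_pod_far(0.87, 0.66)
print(f"CSI implied by POD = 0.87 and FAR = 0.66: {implied_csi:.2f}")
```

Under these definitions, a POD of 0.87 and a FAR of 0.66 imply a CSI of about 0.32, in agreement with the reported value. The CSI stays low despite the high POD because false alarms enter its denominator.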

The system was also tested against global streamflow observations from the Global Runoff Data Centre 
(GRDC). Using the research satellite precipitation information gave results of positive daily and monthly NSC 
values for 32% and 60% of the gauges with a mean of 0.22 and 0.39, respectively, which is promising con- 
sidering the model was using only a priori parameters. The real-time precipitation data produced similar 
results in a parallel comparison, showing no significant difference at daily scale except in the northern mid- 
latitudes, where the research product produces better streamflow statistics than the real-time data, due to 
the positive influence of rain gauges in middle and higher latitudes. Validation using real-time precipitation 
across the tropics (30°S-30°N) gives positive daily Nash-Sutcliffe Coefficients for 107 out of 375 (28%) sta- 
tions with a mean of 0.19 and 51% of the same gauges at monthly scale with a mean of 0.33. Better model 
performance was noted in deep tropics and subtropics as compared to midlatitudes at monthly and daily 
scales. Analysis of individual observed versus simulated hydrographs indicated that the simulated flood 
wave generally leads the observations by 1 day in the mean for the two selected gauges, possibly related to 
the current channel hydraulic parameter configurations and lack of floodplain delineation. The model 
appears sensitive to the Manning roughness coefficients. A sensitivity test with an increased Manning coef- 
ficient significantly reduced the lag and increased the NSC. 

Uncertainties in the model inputs, model structure, parameter sets, and evaluation data can introduce considerable uncertainties in the results of this study. We will investigate the uncertainty impacts on the flood estimation in future work, which is even more important in flood forecasting. However, both the flood event-based and the streamflow gauge-based evaluation indicated that even with the current quality of satellite-based precipitation, the model performance can likely be improved through hydrologic model development, particularly to include floodplain and reservoir/dam effects in the routing model (to decrease the false alarms) and better model parameterization and regional calibration. The model calibration strategy requires consideration of the uncertainty effects, particularly from the precipitation forcing. In addition to these directions, high-resolution (1 km) routing and water-storage calculations are being implemented for global real-time calculations, as well as combining the satellite precipitation information with precipitation forecasts from numerical weather prediction models to extend the real-time hydrological calculations into the future.